Biostar

50,014 results • Page 2 of 1001

Trying to use RSeQC to identify strandedness, but I am getting the error `Could not retrieve index file`. Code: infer_experiment.py --i test-sorted.bam -r output.bed12 Output: ``` [E::idx_find_and_load] Could not retrieve index...file for 'test-sorted.bam' Reading reference gene model output.bed12 ... Done Loading SAM/BAM file ... Total 200000 usable reads

RSeQC RNA-Seq

updated 5 days ago • Prawesh

Hello, I ran kallisto on my data and am now assigning gene names to the results. I tried to do this in two different ways, but I get different results. The first way, shown below, uses the t2g.py from https://github.com/pachterlab/kallisto-transcriptome-indices/releases: #Create the transcripts_to_genes file python t2g.py --use_version <homo_sapiens.grch38…

biomart RNAseq kallisto

updated 5 days ago • bioinfo

If I want to see whether my transcription factor ChIP-seq correlates with a histone mark ChIP-seq, is this common practice: bamcompare to get the bigwig file with fold enrichment, and then bigwigsummary and plotcorrelation? But this method gave me a very low correlation coefficient…

chipseq bigwigsummary deeptools correlation

updated 6 days ago • Emily

I have previously used the BioMart web portal to download FASTAs for the 3' UTRs of a list of Ensembl stable gene IDs. Typically I limit my output to "MANE Select" as I am trying to get just one

utr biomart

updated 6 days ago • RNAseqer

Hello everyone, I have an annotation file like this ``` less -S Sars_cov_2.ASM985889v3.101.gtf | head -20 #!genome-build ASM985889v3 #!genome-version ASM985889v3 #!genome...protein_version "1"; ``` I have a reference genome: sequence.fasta and a BAM file: ILS_W_V_558_S2_R1_001_val.bam that looks like this: ``` samtools view ILS_W_V_558_S2_R1_001_val.bam | head NB551648...Parse …

WGS SARS-CoV2

updated 6 days ago • Adyasha

data and not change anything else. I have been able to write some R code that can do this to files in STRUCTURE format as that format is amenable to import into R. I am wondering if it is possible to do something similar...directly to the VCF file or to all the .bed, .fam, .bim files generated from PLINK from a VCF. I have looked into VCFtools and BCFtools but the merge functions...and collate fu…

snp vcf genomics plink

updated 6 days ago • ajbarrett98

datasets and I keep getting this error while trying to generate the Gene and Transcript counts files. ``` Dataset Error Report An error occurred while running the tool toolshed.g2.bx.psu.edu/repos/iuc/stringtie/stringtie

stringtie galaxy

updated 6 days ago • trkfs

GT \ --out /data/small_CB_pN \ --sam-verbose 10000000 \ --vcf-verbose 100000 ``` (The .bam file comes from scRNA-seq data using a Parse Biosciences kit, hence the pN UMI tag.) When I run this command, I get a .single file and...empty .best and .sing2 files. I also get this message: terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc I have read...th…

Biosciences Demuxlet

updated 6 days ago • eking28

degenerate ("4d") sites either from the individual genes (pre-alignment), or from the alignment files directly? It seems like this is fairly standard practice for a lot of phylogenomic analysis, going by the literature

alignments

updated 6 days ago • J.

of a plant species. I have completed the gene prediction using the Augustus pipeline. The output file is of format `.gff` . Now I want to perform the gene annotation by performing `BLAST` for which I need the coding sequences in...a `.fasta.` file. This is the method that I've thought of approaching. 2. Use the `perl` script `getAnnoFasta.pl` to get the `amino acid sequence...and later t…

augustus annotation assembly genome

updated 6 days ago • Vijith

command but I am getting empty seq_89.vcf.gz output Here is the command I used: Generate your bam file as usual samtools sort -@ 56 seq-89-highquality.bam -o seq-89-highquality.sorted.bam bcftools mpileup -Ou -f KT992094.1.fasta

consensus

updated 6 days ago • Ghada

Hey everyone, I am doing a lot of variant calling. So far, I have always used the Ensembl reference genomes with the "number only" nomenclature for the main chromosomes. My default workflow (very simplified) is: Map FASTQ to Ensembl reference genome -> call variants -> annotate variants with VEP. I prefer to use the VEP cache over GTF/GFF files for annotation since this is recommended by …

Mutect2

updated 7 days ago • gernophil

flagstat -@ 20 $MAPPING/${i}.sorted.bam > $MAPPING/${i}.mapping_stats.txt", the output text file is empty. Please suggest a solution

samtools-flagstat

updated 7 days ago • ramendra.sarma

Hello everybody, I wanted to align some files against the reference genome using the following script: files="Chionobathyscus_dewitti_12 Chionobathyscus_dewitti_14...bwa_db=GCA_943594065.1_fChiDew1_genomic.fna.gz (# reference genome) for sample in $files do echo $sample bwa mem -t 2 $bwa_db ${sample}.1.fq.gz ${sample}.2.fq.gz | samtools view -b | samtools sort --threa…

Samtools bam

updated 7 days ago • Vahid

Hi, I am looking for a fasta file that contains mouse rRNA sequences, but I noticed that the links I searched on the internet point to some different

fasta mm10 rRNA

updated 7 days ago • octpus616

github.com/kevinblighe/PCAtools [2]: https://github.com/kevinblighe/PCAtools?tab=readme-ov-file#quick-start-deseq2 [3]: https://github.com/kevinblighe/PCAtools?tab=readme-ov-file#quick-start-gene-expression-omnibus

eigencorplot PCAtools deseq2 pca

updated 7 days ago • BioinfGuru

Hi there, I'm working on a joint call-set of 47 VCFs which I will be merging with `GLNexus`. Now, I've done this before but, for some reason, since I added 2 extra samples to the original 45 (47 in total), there have been a few issues. The original 45 samples are from the SGDP, called with `UnifiedCaller`; the 2 extra are archaic Neanderthal and Denisova, which have been called with the same pipe…

sort bcftools GLNexus merge VCF

updated 8 days ago • Matteo Ungaro

longest transcript variant per gene. Orthofinder provides a script for this but it only applies to files downloaded from Ensembl. Does anyone know a tool that can help me with this? Basically I have these files for each species...braker.aa braker.codingseq braker.gff3 ``` My protein file looks like this: ``` head -n 2000 braker.aa ``` ``` >g176.t1 MTKLTKRLELQMESSRLGLLRSHSRARSSKLASSQSKA…

transcript longest variant orthofinder

updated 8 days ago • sansan_96

counts_matrix <- GetAssayData(data, assay='SCT', slot='counts') writeMM(counts_matrix, file=paste0(file='/lustre1/project/stg_00079/students/soniya/seurat/matrixraw.mtx')) # write dimensional reduction matrix...PCA) write.csv (SRT@reductions$pca@cell.embeddings, file='/lustre1/project/stg_00079/students/soniya/seurat/pca.csv', quote=F, row.names=F) libr…

updated 9 days ago • beginner123

OUTPUT_DIR/${base_name}_2_unpaired_trimmed.fastq.gz" # Check if trimmed files already exist trimmed_files_exist=true for file in $trimmed_paired1 $trimmed_unpaired1 $trimmed_paired2 $trimmed_unpaired2...logic here else echo "Trimmed files for $base_name already exist. Skipping." fi …

GATK sentieon BWA-MEM

updated 9 days ago • melissachua90

100000000 -c 1 -endPlugin -runfork1 which seems to run fine. There are a couple errors in the log file. This one appears near the beginning, but it doesn't seem to stop the rest of the script from running: [SQLITE_ERROR] SQL error...number of low quality reads=19326 Timing process (sorting, collapsing, and writing TagCount to file). Process took 2855.360497 milliseconds. tagCntM…

GBS Tassel5

updated 9 days ago • meck

From [this post][1], I'm still struggling with GBS. I know it's pretty particular about file format, and so I'm wondering if there's something wrong with my files that I'm not seeing. First, here are a few of my file names

tassel fastq gbs

updated 9 days ago • meck

Hello everyone, I am facing challenges with liftover of a VCF file from hg19 to hg38 using GATK because of 'I' and 'D' annotations representing insertions and deletions in the VCF file. Running...rejeceted_variants.vcf --RECOVER_SWAPPED_REF_ALT True Despite converting the VCF file to VCF version 4.2 using vcftools, I'm still having this issue. htsjdk.tribble.TribbleException: The provided…

gatk vcf liftover

updated 9 days ago • Omics data mining

Hello, I am using Practical Haplotype Graph v2.2.85.134 to build a pangenome graph using six diploid plant species (12 haplotypes). I was able to go through their [Build and Load module][1]. Then, I simulated 10 million WGS Illumina NovaSeq paired-end reads from two of my haplotypes (australasica_primary and fallglo_primary) and mapped them to the pangenome using their [Imputation module][2]. The i…

Pangenome PHG graph

updated 9 days ago • beantkapoor16

better to align to both genomes separately, allowing multimapping for spike-in, and then filter BAM files to exclude reads present in both of them. But I am not sure if I am missing anything, so any input or advice is welcome. Thanks

alignment multimapping RNA-seq ChIP-seq spike-in

updated 10 days ago • maria.soler

I got the haps file and sample file after pre-phasing with eagle2. After that, I tried to switch to a vcf file using SHAPEIT4, but it keeps saying...that there is no index file. How can I easily convert to a vcf file? Thank you

pre-phasing GWAS imputation

updated 10 days ago • SeoGyun

Hi, I'm trying to use PRS-CSx, which requires SNP, A1, A2, Beta/OR and P value/SE. I want to use the individual study betas instead of the random/fixed effect. Is there a way to calculate the P value/SE for the individual study betas from this information? Columns in the file: CHR Chromosome code BP Basepair position SNP SNP identifier A1 Effect allele A2 Non-effect allele …

prscsx beta se pvalue

updated 10 days ago • curious_butterfly

dependencies are properly addressed. However, I am encountering challenges with organizing the input files, particularly with respect to arranging the BAM files in the accompanying text file. As part of my analysis, I possess...BAM files corresponding to different species along with their respective GTF files. My primary concern lies in structuring...the input file, particularly regarding the ap…

rmats

updated 10 days ago • Lambodarswain316

I'm searching for a long pattern in my fastq file using `bbduk.sh` and `seal.sh`. Neither can find it, even though I can `grep` it. ``` $ grep --color=always CGAGTACCCT 10_ID_mRNA_S1_L002_R1_001.fastq

pattern bbduk java bbmap fastq

updated 10 days ago • Assa Yeroslaviz

Hi all, I have tfam and tped files from dog data. I need to convert these to map and ped files for a program I will be using. I've used the code ```plink --dog --tfile...including other variations such as ```ls -1 *.tped | sed 's/\_*.tped//'`; do plink --dog --tfile ${file} --recode --out ${file}_ ; done```. However, I keep getting an output .ped file that's just a bunch of nucleotides in a …

plink ped tfam tped map

updated 10 days ago • Samantha

study samples merged into one: 1000 WGS study samples + 2504 1000Genomes samples 2. Created a ".pop" file with "-" for study samples and one of the below listed ancestry for 1000Genome samples in the same order as in ".fam" file. 3. admixture...0.124 0.090 My question is how to assign the ancestry name to the output columns in the ".Q" file? Is this a sorted list of the 5 ancestry names from t…

supervised admixture

updated 10 days ago • RT

longest ORF in that identified sequence? Identify all repeats in a sequence for all sequences in the FASTA, along with how many times each repeat occurs and which is the most frequent repeat.” The primary problem I think I have...is that I don’t know how to reference the sequences inside a FASTA file beyond what I have already, so my has_codon section of code isn’t working like I think it should…

Python ORF FASTA Biopython

updated 10 days ago • cput

ANNOVAR to annotate my dog tumor Illumina whole genome sequenced DNA reads. It generated 3 output files: 1. `exonic.variant.function` 2. `variant_function`; and 3. `.log` My exonic variant function file has many unknown sites. Is

ANNOVAR

updated 10 days ago • sainavyav22

this issue downloading metadata tsv reports from ENA Portal API, it only happens with very large files (over 100 MB): they are saved to TSV before they are fully downloaded, without showing any error or warning. If I try to download...the same file from the browser it's fully downloaded, but it's not what I'm looking for. This is the code: ```py projectID = "PRJNA43021" s = rq.session...True) w…

ena python

updated 10 days ago • Giulia

Hello everyone, I'm calculating the length of reads from a BAM file to create plots later on. I haven't filtered out secondary and/or supplementary alignments, and I can't understand why I

samtools BAM

updated 10 days ago • marco.barr

I am using the rMATS software and have resolved all dependencies. Now I am facing a problem with the input files, mainly how to list the BAM files in the text file. I have BAM and GTF files from different species. There is confusion about b1 and b2 during...run file if I have 1 species and 1 GTF file, then how can I add species at run time? If I am wrong, please guide me to make a correct...input file

rmats

updated 10 days ago • Lambodarswain316

Hi, I have some normalized BigWig files and now I want to convert them to a count matrix. Can anyone give me any advice? I would appreciate it

count-matrix BigWig

updated 10 days ago • feather-W

for that, I require their DNA sequencing reads. I believe I can obtain these reads from their CRAM files of the normal samples, so I downloaded a slice of the CRAM according to their instructions bin/score-client view --object...id 28358cf3-fba0-51a3-8b93-104bd5d48b23 --reference-file /home/victor/ref-fasta/GRCh38_full_analysis_set_plus_decoy_hla.fa --output-dir /media/victor/c1d5c312-b546-…

icgc samtools cram

updated 11 days ago • Javier

Good morning, I aim to chop an already aligned BAM file based on different regions of a gene as follows: samtools view -b -o out.bam -L regions.bed origin.bam samtools sort -o out.sort.bam...out.bam samtools index Then I want to convert the `out.bam` into two unaligned FASTQ files (each member of the read pair parsed to one of the two files) using SamToFastq: …

samtofasq picard validatesamfile

updated 11 days ago • Lila M

working with two lanes of Hi-C reads (2 forward, 2 reverse). Initially, I was merging the raw FASTQ files before mapping, but I've since been mapping each set of forward and reverse reads independently with BWA-MEM2 and then...merging the BAM files. I've also tried using only a single lane of forward and reverse reads. 2. Initially, I deviated from the VGP pipeline and

BWA-MEM2 Hi-C PretextMap

updated 11 days ago • Winter

Hello, how do I import a FASTQ file from my local Windows computer into Fluent Terminal (WSL)?

Fastq

updated 11 days ago • oumo

base) I'm not sure how to interpret this - I assume that some internal python3 file is missing or not found. Possibly pycosat, given the SAT solver errors in installing and creating environments? conda...install -c conda-forge biopython # code to generate error /opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/conda_package_streaming

biopython conda mamba pycosat

updated 11 days ago • kacollier

I have some very large WGS BCF files and I would like to extract just the first 8 columns, thus reducing them to a 'sites-only' VCF/BCF. Does BCFtools have a canned option...t%REF\t%ALT\t%QUAL\t%FILTER\t%INFO\n" but I'm finding t…

sites-only bcftools

updated 11 days ago • Matthew

Hello, I'm annotating a VCF file using Ensembl VEP and a gnomAD v4 VCF file: vep --cache --offline --species homo_sapiens --assembly GRCh38 \ --input_file input.vcf \ --custom gnomad.vcf.b…

vep vcf gnomad

updated 11 days ago • asalimih

Hello all, Data: paired-end RNA-seq data. I have an issue with the featureCounts output: the Assigned reads are greater than the HISAT reads aligned concordantly exactly 1 time ``` From HISAT: aligned concordantly exactly 1 time is 48335140 From featureCounts summary: Assigned: 64074047 ``` Assigned value is 1.32 times greater than HISAT mapping results. It's weird that Assigned value is hig…

RNA-seq featureCounts HISAT

updated 11 days ago • Prawesh
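
As a quick arithmetic check on the featureCounts snippet above, the ratio can be recomputed from the two counts quoted in the post. This is only a sketch: the variable names are illustrative and not taken from HISAT or featureCounts, and the calculation says nothing about why the two counts differ.

```python
# Counts quoted in the post; the variable names are illustrative only.
hisat_concordant_once = 48_335_140   # HISAT: aligned concordantly exactly 1 time
featurecounts_assigned = 64_074_047  # featureCounts summary: Assigned

# Ratio of assigned reads to concordant, uniquely aligned pairs.
ratio = featurecounts_assigned / hisat_concordant_once

# Absolute difference between the two counts.
excess = featurecounts_assigned - hisat_concordant_once

print(f"ratio  = {ratio:.3f}")
print(f"excess = {excess:,}")
```

The ratio comes out just under 1.33, in line with the roughly 1.32-fold difference reported in the post.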

Hi there, I am working on the de novo analysis of a bacterium. This bacterium is related to Bacillus paranthracis, which was identified using rMLST, TYGS, and BLAST analysis. The genome assembly file was then provided to the KEGG-KASS tool. Although the prokaryotic dataset was selected for analysis, the identified KO…

KEGG-KASS WGS Pathway denovo

updated 11 days ago • mathavanbioinfo

Hello everyone, I'm currently working with VCF files of mutations from the TCGA dataset using the hg38 assembly. To further my analysis, I'm interested in comparing mutation

TCGA hg38 methylation Illumina

updated 11 days ago • elisheva

my current experience in IT financial system support got certification in bioinformatics and python/biopython

bio

updated 12 days ago • shehab

fetus (the one inherited from the mother by the fetus). The final results I'm looking for are a bam file with the reads from the unique chromosome of the mother, another with the reads from the unique chromosome of the child

long-reads phasing

updated 12 days ago • njornet

Hey everyone, I need some help with SAMtools (v 1.3.1) and such. I have 2 files that I want to align, with the ultimate goal of understanding what percentage of the reference genome is covered by…

SAMtools BWA alignment ddRAD

updated 12 days ago • Lemonhope
